Correcting Text

ABSTRACT

Systems and methods are provided for correcting grammatical and spelling errors that involve improper positioning of a whitespace character and/or an extra whitespace character. Removal of an extra whitespace character or repositioning of an improperly positioned whitespace character may result in correction of two misspelled words in a single correction step.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/859,734 entitled “Text Correction” and filed Jun. 2, 2004 which claims the benefit of U.S. provisional patent application No. 60/475,578 entitled “System and Method of Correcting Text” and filed Jun. 2, 2003, the disclosures of both are incorporated herein by reference.

BACKGROUND

1. Field of the Invention

The invention is in the field of computer programming and more specifically in the field of text processing.

2. Related Art

Text processing and text correction features are found in a wide variety of computing devices. For example, spelling and grammar correction are found in most word processing programs, presentation programs, database programs, and the like. It is desirable to make text and grammar correction as efficient as possible.

In current correction algorithms a single mistake that involves two words requires two separate corrections. For example, “spellingm istake” is a single misplacement of a whitespace character but requires separate correction steps to correct both “spellingm” and “istake.” A similar problem occurs with mistakes such as “spelli ng,” which involves an extra whitespace character. In this case, a first correction step is required to replace “spelli” with “spelling” and a second correction step is required to eliminate “ng.”

Whitespace characters include a space, a tab, a carriage return, and the like used to separate non-whitespace characters. In some embodiments, these characters are the ASCII characters represented by decimal values 32, 10, 11, 12, 13, or the like.

SUMMARY

The invention includes systems and methods of correcting text errors including those involving whitespace characters. In various embodiments, errors resulting from extra and/or misplaced whitespace characters are corrected in a single step. In some cases the single step results in the correction of two misspelled words.

Various embodiments of the invention include a method of correcting text, the method comprising detecting a word including a spelling error, testing to see if shifting a character to or from a first adjacent word in a first direction solves the spelling error, the first adjacent word being adjacent to the word including the spelling error, and resolving the spelling error responsive to the testing.

Various embodiments of the invention include a method of correcting text, the method including detecting an error in the text, testing to see if moving a location of a first whitespace character solves the error, and resolving the error responsive to the testing.

Various embodiments of the invention include a system for text processing, the system comprising memory configured to store text, a display configured to display the text, and computer instructions configured to identify and resolve an error resulting from an improperly positioned whitespace character within the text, the resolution including a single replacement step involving replacing one or more words in the text with a correction candidate.

Various embodiments of the invention include a system for text processing comprising means for detecting a spelling error including a misplaced whitespace character, means for identifying a solution to the spelling error, and means for resolving the spelling error responsive to the solution.

Various embodiments of the invention include a system for text processing comprising means for detecting an error in text, the error including an extra whitespace character, means for identifying a solution to the error including eliminating the extra whitespace character, eliminating the extra whitespace character resulting in a reduction of a total number of words in the text, and means for resolving the error responsive to the solution.

Various embodiments of the invention include a computer readable medium including computer instructions, the computer instructions comprising a code segment configured for detecting an error in text, a code segment configured for testing to see if moving a location of a first whitespace character solves the detected error, and a code segment configured for resolving the detected error responsive to the testing.

Various embodiments of the invention include a method of correcting text, the method comprising detecting a word including a spelling error, testing if elimination of a whitespace character resolves the spelling error without creating a new spelling error, and resolving the spelling error responsive to the testing.

Various embodiments of the invention include a method of correcting text, the method comprising detecting an extra whitespace character within the text, the extra whitespace character positioned between two words, testing if eliminating the extra whitespace character results in converting the two words to one word, the one word being correctly spelled, and correcting the text responsive to the testing.

Various embodiments of the invention include a method of correcting text, the method comprising detecting a spelling error, determining a first correction candidate configured to replace one word in the text, determining a second correction candidate configured to replace two words in the text, displaying the first correction candidate and the second correction candidate to a user, and correcting the spelling error responsive to a selection received from the user, the selection being of the first correction candidate or the second correction candidate.

BRIEF DESCRIPTION OF THE VARIOUS VIEWS OF THE DRAWING

FIG. 1 is a block diagram illustrating a computing system, according to various embodiments of the invention;

FIG. 2 illustrates a method of correcting an error, according to various embodiments of the invention;

FIG. 3A and FIG. 3B each illustrate a different display used to present a correction list to a user, according to various embodiments of the invention; and

FIG. 4 illustrates sub-steps within a resolve error step, according to some embodiments of the invention.

DETAILED DESCRIPTION

The invention includes systems and methods for correcting spelling and grammatical errors involving improper positioning and/or addition of an extra whitespace character. Examples of these errors corrected include: “makin g mistake” wherein an extra space is found between the “n” and the “g”; “making m istake” wherein an extra space is found between the “m” and the “i”; “makin gmistake” wherein the position of a space is incorrectly shifted to the right; and “makingm istake” wherein the position of a space is incorrectly shifted to the left. These errors may cause one or two spelling errors. For example, “I scorrect” includes one spelling error and “makin gmistake” includes two spelling errors.

In various embodiments, the error resulting from an additional space or an improperly positioned space results in a grammatical error. In some of these embodiments there is a grammatical error but not a spelling error. For example, if “It is one car seat” is incorrectly written as “It is one cars eat.” there is a grammatical error but no spelling error resulting from an improperly positioned space.

Correction of these errors, grammatical or spelling, includes repositioning an improperly positioned whitespace character and/or removing an extra whitespace character. In some cases one spelling error is corrected and in some cases two errors are corrected in a single correction step. In some embodiments, the correction includes replacement of one word or replacement of two words, where a word is a group of non-whitespace characters between whitespace characters. For the purposes of this disclosure and claims, “are two” and “ar etwo?” are two words each. “Are three words” and “are t here” are three words. In some embodiments, correction of an error includes replacing two words with one word. In some embodiments, correction of an error includes replacing two words with two words.
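
The word definition above can be illustrated with a minimal Python sketch. The function name count_words is hypothetical and is not part of the disclosure; the sketch only assumes that a word is any run of non-whitespace characters.

```python
def count_words(text: str) -> int:
    """Count words, where a word is a run of non-whitespace characters."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, line breaks) and discards empty strings.
    return len(text.split())


if __name__ == "__main__":
    for sample in ["are two", "ar etwo?", "Are three words", "are t here"]:
        print(repr(sample), count_words(sample))
```

Under this definition a misplaced space leaves the word count unchanged (“ar etwo?” is still two words), while an extra space raises it (“are t here” is three words).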

FIG. 1 is a block diagram illustrating a Computing System 100, according to various embodiments of the invention. Computing System 100 includes a Display 110 configured to view text as well as an optional User Input 120 configured for a user to input text and/or select options presented on Display 110. Computing System 100 further includes Storage 130, configured to store, for example, Correction Code 140 and Text 150. A Processor 160 is configured to execute Correction Code 140 and control Display 110.

In various embodiments, Display 110 includes a computer screen, personal digital assistant display, an electronic book display, a video display, a communication device display, a telephone display, or the like.

In various embodiments, User Input 120 includes a button, a keyboard, graphical user interface, handwriting recognition device, touch sensitive device, or the like. In various embodiments, User Input 120 includes a digital data input device such as a communications port (e.g., an Ethernet, a serial or parallel port), a memory interface, a drive (e.g., hard drive, floppy drive, compact disk drive, digital versatile disk drive), or the like. In these embodiments, User Input 120 is used to transfer text data to Computing System 100. For example, in one embodiment User Input 120 includes a compact disk drive configured to read text data from a compact disk.

In various embodiments, Storage 130 includes digital (fixed or removable) memory such as RAM, SRAM, compact disk, digital versatile disk, floppy disk, hard drive, or the like. Storage 130 is configured to store Correction Code 140 and Text 150, and optionally a Word Set 170 and/or Grammar Rules 180. In some embodiments, Storage 130 is distributed among several devices.

Correction Code 140 includes computer code configured to perform methods of the invention as described further herein. For example, in various embodiments, Correction Code 140 includes code configured to detect an error including an improperly positioned whitespace character and/or code configured to detect an error including an extra whitespace character positioned within what should be one word. Further, in various embodiments, Correction Code 140 includes code to offer a user one or more possible correction candidates configured for correcting errors involving a whitespace character.

In various embodiments, Text 150 is character-based text including an alphabet of characters used to form words separated by whitespace. Text 150 is typically received by Computing System 100 using User Input 120.

Word Set 170 is a set of words whose spelling is considered to be correct. For example, Word Set 170 may be a predefined or a user defined dictionary. Grammar Rules 180 is a set of grammatical rules configured for detecting errors in grammar.

FIG. 2 illustrates a method of correcting an error in Text 150, according to various embodiments of the invention.

In a Detect Error Step 210, an error is detected in Text 150, using Correction Code 140. In some embodiments, Detect Error Step 210 includes parsing Text 150 and comparing words found in Text 150 with words in Word Set 170. In some embodiments, Detect Error Step 210 includes parsing Text 150 and comparing Text 150 with grammatical rules stored in Grammar Rules 180. When a word in Text 150 does not match a word in Word Set 170, the word in Text 150 is flagged as including an error.

In some embodiments, Detect Error Step 210 includes monitoring the addition of new text, using User Input 120, to Text 150. As new words are added or old words are modified, they are compared with words in Word Set 170. When a new or altered word in Text 150 does not match a word in Word Set 170, the word in Text 150 is flagged as including an error. In some embodiments, Detect Error Step 210 includes parsing Text 150 for grammatical errors. In these embodiments, a word or words of Text 150 are analyzed using Grammar Rules 180. If the word or words do not fit within the grammatical rules then the words are flagged as including an error.
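
The spelling portion of Detect Error Step 210 might be sketched as follows. This is only an illustration under stated assumptions: the function name flag_misspelled, the punctuation stripping, and the lower-case comparison are all hypothetical choices that the disclosure does not specify, and the grammar check against Grammar Rules 180 is not modeled.

```python
def flag_misspelled(text, word_set):
    """Return (position, word) pairs for words that are not in word_set.

    word_set stands in for Word Set 170 and is assumed to hold
    lower-case entries.
    """
    flagged = []
    for position, word in enumerate(text.split()):
        # Assumption: surrounding punctuation and letter case are ignored.
        normalized = word.strip('.,;:!?"').lower()
        if normalized and normalized not in word_set:
            flagged.append((position, word))
    return flagged


if __name__ == "__main__":
    words = {"use", "the", "wrong", "word"}
    print(flag_misspelled("use the wrongw ord", words))
```

For the text “use the wrongw ord” both “wrongw” and “ord” are flagged, which is the two-error situation that the later steps try to resolve with one correction.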

In some embodiments, Detect Error Step 210 includes systems and methods used by the software programs Microsoft Word®, Microsoft PowerPoint®, Microsoft Outlook®, Microsoft Excel®, Lotus-123®, EMACS, WordPerfect®, Microsoft Access®, Microsoft Visio®, Microsoft FrontPage®, or the like, to detect an error in text.

In various embodiments Detect Error Step 210 includes detecting errors such as “wron gword,” “wrongw ord,” “wro ngword,” “wrong w ord,” “wro ng word,” or the like. These errors reflect a character or characters incorrectly shifted to the right, a character incorrectly shifted to the left, an incorrect extra whitespace character, et cetera.

In a Test Exchange Step 220, Correction Code 140 is used to determine if an error, flagged in Detect Error Step 210, is corrected by exchanging one or more letters between adjacent words, moving a whitespace character, eliminating a whitespace character, or like action. An error is considered corrected if the words in the corrected text would match words in Word Set 170 following the action. There may be more than one possible correction for an error. Each possible correction is considered a “correction candidate.” In some cases, exchanging one or more letters between adjacent words is equivalent to moving a whitespace character. Also, in some cases, removing and then adding a whitespace character is equivalent to moving a whitespace character. These equivalencies are meant to be included in the term “moving a whitespace character” in this disclosure and claims.

In some embodiments of Test Exchange Step 220, Correction Code 140 tests for a character (whitespace or non-whitespace) incorrectly shifted to the right. For example, in one embodiment Correction Code 140 tests to see if “wron gword” is corrected by shifting the “g” back to produce “wrong word,” or if “wro ngword” is corrected by shifting the “ng” back to produce “wrong word.”

In some embodiments of Test Exchange Step 220, Correction Code 140 is configured to test for a character incorrectly shifted to the left. For example, in one embodiment, Correction Code 140 tests to see if “wrongw ord” is corrected by shifting the “w” forward to produce “wrong word.” In some embodiments, similar tests are applied to the errors “wrong w ord” and “wron g word” (each including three words). In these tests the shifted character is the only letter of the words “w” and “g,” and a correction candidate, such as “wrong  word” (including the grammatical error of having two spaces between the words) or “wrong word” (including one space between the words), includes fewer total words than the text with the error. (In some embodiments having two spaces between words is considered a grammar error.) In cases such as these, a space is optionally deleted so that there is not a double space between the remaining words. In these embodiments, the correction candidate includes fewer words than the original text.
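
The two-word shift tests described above can be sketched in Python. Because exchanging letters between adjacent words is treated as equivalent to moving the whitespace character, this sketch simply tries every other position for the space. The function name and the lower-case word set are assumptions made for illustration, and the three-word cases such as “wrong w ord” are not handled here.

```python
def move_whitespace_candidates(left, right, word_set):
    """Try every other position for the space between two adjacent words.

    Returns the two-word candidates in which both words are found in
    word_set (a stand-in for Word Set 170, assumed lower-case).
    """
    letters = left + right
    original_split = len(left)
    candidates = []
    for split in range(1, len(letters)):
        if split == original_split:
            continue  # this is the text as entered, not a correction
        new_left, new_right = letters[:split], letters[split:]
        if new_left in word_set and new_right in word_set:
            candidates.append(f"{new_left} {new_right}")
    return candidates


if __name__ == "__main__":
    words = {"wrong", "word"}
    for pair in [("wron", "gword"), ("wro", "ngword"), ("wrongw", "ord")]:
        print(pair, move_whitespace_candidates(*pair, words))
```

Each of the three example pairs yields the single candidate “wrong word”, regardless of whether the letters were shifted to the right or to the left.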

In some embodiments of Test Exchange Step 220, Correction Code 140 is configured to test for an extra whitespace character. For example, in one embodiment, Correction Code 140 tests to see if “wro ng” is corrected by eliminating a whitespace character next to a word flagged as having an error, to produce “wrong.” In these embodiments, the correction candidate includes fewer words than the original text.
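
The extra-whitespace test reduces to joining the flagged word with its neighbor and looking up the result. A minimal Python sketch, with a hypothetical function name and word set:

```python
def remove_whitespace_candidate(left, right, word_set):
    """Return the joined word if removing the space yields a known word.

    word_set stands in for Word Set 170. Returns None when the joined
    letters are not a known word.
    """
    joined = left + right
    return joined if joined in word_set else None


if __name__ == "__main__":
    words = {"wrong", "word"}
    print(remove_whitespace_candidate("wro", "ng", words))
    print(remove_whitespace_candidate("wrong", "word", words))
```

Here “wro ng” produces the one-word candidate “wrong”, while “wrong word” produces no candidate because “wrongword” is not in the word set.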

Test Exchange Step 220 optionally includes a plurality of tests, such as those described herein. Typically, a series of tests will be made until one or more possible correction candidates, such as “wrong word,” are found.

In some embodiments of Test Exchange Step 220, correction candidates are given a greater relevancy if they correct two spelling mistakes at once instead of just correcting one spelling mistake. For example, the phrase “wron gword” includes two words that typically would be flagged as having a spelling error in Detect Error Step 210. The correction candidate “wrong word” corrects both of these spelling errors and thus may be given greater relevancy than a correction candidate that dealt with only one of the words at a time. In other words, Correction Code 140 could either deal just with correcting the word “wron” and arrive at correction candidates such as “worn,” “wrong,” “wren” and/or Correction Code 140 could apply the tests of Test Exchange Step 220 and find the correction candidate “wrong word.” Since the correction candidate “wrong word” corrects two errors it may be given greater relevancy.
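
One way to express this relevancy rule is to sort candidates by the number of flagged words each one corrects. The sketch below is an assumption about how such a ranking could be coded; the Candidate type, its field names, and the use of a simple count as the relevancy score are hypothetical and are not taken from the disclosure.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    replacement: str       # text offered to the user
    errors_corrected: int  # number of flagged words this candidate fixes


def rank_by_relevancy(candidates):
    """Order candidates so that those fixing more errors come first.

    sorted() is stable, so candidates with equal counts keep their
    original relative order.
    """
    return sorted(candidates, key=lambda c: c.errors_corrected, reverse=True)


if __name__ == "__main__":
    found = [
        Candidate("worn", 1),
        Candidate("wrong", 1),
        Candidate("wren", 1),
        Candidate("wrong word", 2),
    ]
    for candidate in rank_by_relevancy(found):
        print(candidate.replacement)
```

With the “wron gword” example, “wrong word” moves to the top of the list because it corrects both flagged words.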

In a Resolve Error Step 230 the one or more correction candidates found in Test Exchange Step 220 are optionally used to correct Text 150 by replacing the text including the error found in Detect Error Step 210 with one of the correction candidates found in Test Exchange Step 220. For example, in an embodiment wherein Text 150 includes the sentence, “Be sure not to use the wrongw ord in a patent claim” and a correction candidate is “wrong word,” Resolve Error Step 230 may include correcting Text 150 to read “Be sure not to use the wrong word in a patent claim.” In this case two spelling errors are corrected using a single correction candidate in a single correction step. In some embodiments, the replacement is made automatically using Correction Code 140. In one of these embodiments the correction candidate with the greatest relevancy is used to correct the error.

In some embodiments of Resolve Error Step 230, the correction is made by offering the one or more possible correction candidates to a user, and requesting that the user choose a preferred correction candidate from among the offered correction candidates. For example, in response to the error “wron gword” the user may be offered the following list of correction candidates: “wrong word,” “wrong,” “wren,” and “worn.” If the user then selects a preference, the selected correction candidate is used to replace the text including the error found in Detect Error Step 210. In some embodiments, the total number of words in Text 150 is reduced in Resolve Error Step 230. For example, when “wrong w ord” (three words) is replaced by “wrong word” (two words), one word is eliminated. When the correction candidate is selected by Correction Code 140 based on both the first and second error, the correction candidate (e.g., “wrong word”) may include two words. Thus, both errors may be corrected in a single correction step, e.g., through one replacement step involving a single correction candidate. Alternatively, when the correction candidate is selected by Correction Code 140 based on only the first error, then only the first error is typically corrected. Thus, in alternative embodiments, selecting “wrong” as the correction candidate responsive to the error “wron gword” would result in a correction to “wrong gword” and a second correction step would be required to correct “gword.”
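
The difference between a two-word replacement and a one-word replacement can be shown with a short Python sketch. The function name resolve_error is hypothetical, and replacing only the first occurrence of the erroneous text is an assumption made to keep the example small.

```python
def resolve_error(text, text_to_replace, candidate):
    """Replace the first occurrence of the erroneous text in one step."""
    return text.replace(text_to_replace, candidate, 1)


if __name__ == "__main__":
    sentence = "Be sure not to use the wrongw ord in a patent claim"
    # One replacement step that corrects two flagged words at once.
    print(resolve_error(sentence, "wrongw ord", "wrong word"))
    # A one-word candidate leaves the second error for a later step.
    print(resolve_error("the wron gword", "wron", "wrong"))
```

The first call yields the fully corrected sentence in a single step. The second yields “the wrong gword”, so a further correction step would still be needed for “gword.”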

In some embodiments, the list of correction candidates displayed to a user includes indication of which correction candidates would be used to change two words and which correction candidates would be used to change one word. FIG. 3A and FIG. 3B each illustrate a different embodiment of Display 110 used to present a correction list to a user. Part of Display 110 indicates Text To Be Replaced 310, part of Display 110 indicates Correction Candidates 320, and part of Display 110 presents User Options 330. Text To Be Replaced 310 optionally includes a phrase having several words (e.g., “wron gword”). In FIG. 3B one of the Correction Candidates 320 is “Burma shave.” In various embodiments, this correction candidate is arrived at from the phrase “Burm as have” having an error. For example, in Test Exchange Step 220, Correction Code 140 may first deal with the word “Burm” alone. Correction Candidates 320 such as “burn” are found. Correction Code 140 then tests for errors involving the positioning of a whitespace character and/or exchanging characters with adjacent words, and finds “Burma s” as a correction candidate. However, this correction candidate generates a new error, “s.” Correction Code 140 then tests for correction candidates including the next word and finds that by moving another character a correction candidate, “Burma shave,” without errors is found. This multi-step process considers first one word, then two, then three words in search of Correction Candidates 320.
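
The widening search over one, two, and then three words could be sketched as follows. This is an illustrative simplification, not the disclosed implementation: the function names are hypothetical, the word set is assumed to be lower-case, only whitespace repositioning and removal are considered, and conventional single-word suggestions such as “burn” are not generated.

```python
def segmentations(letters, word_set, max_words):
    """Yield each way to split letters into at most max_words known words."""
    if not letters:
        yield []
        return
    if max_words == 0:
        return
    for end in range(1, len(letters) + 1):
        head = letters[:end]
        if head.lower() in word_set:
            for tail in segmentations(letters[end:], word_set, max_words - 1):
                yield [head] + tail


def widen_search(words, start, word_set, max_window=3):
    """Consider one word, then two, then three, starting at words[start].

    Returns (text_to_replace, candidates) for the first window in which
    repositioning or removing whitespace leaves only known words, or
    None if no such window is found.
    """
    for window in range(1, max_window + 1):
        span = words[start:start + window]
        if len(span) < window:
            break  # ran out of words in the text
        letters = "".join(span)
        found = [" ".join(p) for p in segmentations(letters, word_set, window)]
        if found:
            return " ".join(span), found
    return None


if __name__ == "__main__":
    known = {"burma", "shave", "as", "have"}
    print(widen_search(["Burm", "as", "have"], 0, known))
```

For “Burm as have” the one-word and two-word windows yield nothing (“Burma s” is rejected because “s” is not a known word), and the three-word window yields “Burma shave” as a replacement for the whole phrase.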

Returning to Resolve Error Step 230 of FIG. 2, the preferred Correction Candidate 320 is optionally displayed to the user, and if selected by the user is used to replace the text including an error.

FIG. 4 illustrates sub-steps within an embodiment of the single correction step Resolve Error Step 230.

In a Display Correction List Sub-step 410 a list of identified Correction Candidates 320 is shown to a user using Display 110. As illustrated in FIG. 3A and FIG. 3B, this list is typically associated with a display of Text To Be Replaced 310. There may be more than one instance of Text To Be Replaced 310 displayed at the same time. For example, in FIG. 3A there are two instances of Text To Be Replaced 310, “wron gword” and “wron.” Thus, a user is given more than one option of which text is to be corrected.

In a Receive User Selection Sub-step 420, Processor 160 receives a selection of one of the displayed Correction Candidates 320 from a user. By selecting a preferred Correction Candidate 320 the user may also be specifying which of Text To Be Replaced 310 is to be replaced by the Correction Candidate 320. For example, if a user selects the Correction Candidate 320 “wrong word” in FIG. 3A, then the selected Correction Candidate 320 will be used to replace Text To Be Replaced 310 “wron gword.” In contrast, if the user selects the Correction Candidate 320 “wrong,” then the selected Correction Candidate 320 will be used to replace Text To Be Replaced 310 “wron.”

In a Substitute Correction Candidate Sub-step 430 the selected Correction Candidate 320 is used to replace the associated Text To Be Replaced 310.
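
The three sub-steps might be modeled with a mapping from each displayed correction candidate to the text it would replace. The dictionary layout and the function name substitute_selection below are hypothetical; only the two pairings described for FIG. 3A are included.

```python
# Hypothetical correction list: candidate -> associated text to be replaced.
CORRECTION_LIST = {
    "wrong word": "wron gword",
    "wrong": "wron",
}


def substitute_selection(text, selection, correction_list):
    """Replace the text associated with the user's selected candidate."""
    # Choosing a candidate also determines which erroneous span is replaced.
    text_to_replace = correction_list[selection]
    return text.replace(text_to_replace, selection, 1)


if __name__ == "__main__":
    text = "use the wron gword"
    print(substitute_selection(text, "wrong word", CORRECTION_LIST))
    print(substitute_selection(text, "wrong", CORRECTION_LIST))
```

Selecting “wrong word” replaces both words in one step, while selecting “wrong” replaces only “wron” and leaves “gword” in the text.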

Embodiments of the invention include looking for sequential words with errors and attempting to correct the errors with a single replacement step. Some of these embodiments include offering to the user suggested correction candidates that include more than one word configured to replace more than one word. Some of these embodiments include offering the user a suggested correction candidate that corrects more than one error.

Several embodiments are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations are covered by the above teachings and within the scope of the appended claims without departing from the spirit and intended scope thereof. For example, while the discussion of the invention uses the “space character” as an example of a whitespace character, it is within the scope of the invention to correct errors involving other whitespace characters (e.g., a tab character).

The embodiments discussed herein are illustrative of the present invention. As these embodiments of the present invention are described with reference to illustrations, various modifications or adaptations of the methods and/or specific structures described may become apparent to those skilled in the art. All such modifications, adaptations, or variations that rely upon the teachings of the present invention, and through which these teachings have advanced the art, are considered to be within the spirit and scope of the present invention. Hence, these descriptions and drawings should not be considered in a limiting sense, as it is understood that the present invention is in no way limited to only the embodiments illustrated.

1. A system for text processing, the system comprising: memory configured to store text; a display configured to display the text; and computer instructions configured to identify a spelling error resulting from an improperly positioned whitespace character within the text, the improperly positioned whitespace having been entered by a user, and resolve the spelling error by moving the improperly positioned whitespace character from a first position to a second position.

2. The system of claim 1, wherein a correction candidate to the spelling error includes two words.

3. The system of claim 1, wherein the computer instructions are further configured to resolve the spelling error by replacing two or more words in the text with a correction candidate.

4. The system of claim 1, wherein the display comprises a display of a personal digital assistant or a personal computer.

5. A computer readable medium including computer instructions, the computer instructions comprising: a code segment configured for detecting a spelling error in text; a code segment configured for testing whether moving a first whitespace character from a position as entered by a user solves the detected spelling error; and a code segment configured for resolving the detected spelling error responsive to the testing.

6. A method of correcting text, the method comprising: detecting a word including a spelling error; testing to determine if shifting a letter to or from a first adjacent word in a first direction solves the spelling error, the first adjacent word being adjacent to the word including the spelling error, and if shifting a letter to or from the first adjacent word in a second direction solves the error; and resolving the spelling error responsive to the testing.